Chinese News Text Classification Method via Key Feature Enhancement
Abstract
(1) Background: Chinese news text is a popular form of media communication, which can be seen everywhere in China. Chinese news text classification is an important direction in natural language processing (NLP). How to use high-quality text classification technology to help humans efficiently organize and manage the massive amount of web news is an urgent problem to be solved. Existing deep learning methods rely on large-scale tagged corpora for news text classification tasks, and the resulting models are poorly interpretable because of their large size. (2) Methods: To solve the above problems, this paper proposes a Chinese news text classification method based on key feature enhancement, named KFE-CNN. KFE-CNN expands the semantic information of key features to enhance the sample data, then uses a zero–one binary vector representation to transform text features into binary vectors and inputs them into a CNN model for training and implementation, thus improving the interpretability of the model and compressing its size. (3) Results: The experimental results show that the method significantly improves the overall performance of the model; the average accuracy and F1-score on a THUCNews subset of the public dataset reached 97.84% and 98%, respectively. (4) Conclusions: This fully demonstrates the effectiveness of KFE-CNN on the Chinese news text classification task, and it also demonstrates the method's classification performance.
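The abstract does not spell out how the zero–one binary vector representation or the key feature expansion is built, so the following is only a minimal sketch under stated assumptions, not the KFE-CNN implementation. It assumes that a text is already tokenized, that key feature enhancement can be approximated by appending related terms from a lookup table, and that the binary vector marks which vocabulary terms occur. The names `expand_key_features`, `to_binary_vector`, `related_terms`, and the toy vocabulary are all hypothetical.

```python
# Hypothetical sketch only: the function names, the related-terms table,
# and the toy vocabulary are assumptions, not taken from the paper.

def expand_key_features(tokens, related_terms):
    """Append assumed related terms for every token that has an entry."""
    expanded = list(tokens)
    for tok in tokens:
        expanded.extend(related_terms.get(tok, []))
    return expanded


def to_binary_vector(tokens, vocabulary):
    """Return a zero-one vector: 1 where the vocabulary term occurs in tokens."""
    present = set(tokens)
    return [1 if term in present else 0 for term in vocabulary]


# Toy, invented data for illustration.
vocabulary = ["股票", "基金", "比赛", "球队", "市场"]
related_terms = {"股票": ["市场"]}  # assumed semantic expansion table
tokens = ["股票", "上涨"]           # an already-tokenized headline fragment

plain_vector = to_binary_vector(tokens, vocabulary)
enhanced_vector = to_binary_vector(
    expand_key_features(tokens, related_terms), vocabulary
)

print(plain_vector)     # [1, 0, 0, 0, 0]
print(enhanced_vector)  # [1, 0, 0, 0, 1]
```

In this sketch the enhanced vector carries one extra active position for the related term, which is the kind of input a CNN classifier could then consume. The CNN itself is omitted here because the abstract gives no architectural details.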
Similar resources
An Improved CHI Feature Selection Method for Chinese Text Classification
We propose a feature selection method named ICHI, based on an improved CHI. Classification experiments show that the feature extraction effect of the ICHI method is better than that of traditional CHI when it is used to select features for SVM and KNN classification, and that ICHI can improve text classification accuracy and is well suited to feature extraction.
A Novel One Sided Feature Selection Method for Imbalanced Text Classification
Imbalanced data can be seen in various areas such as text classification, credit card fraud detection, risk management, web page classification, image classification, medical diagnosis/monitoring, and biological data analysis. Classification algorithms tend to favor the large class and may even treat minority-class data as outliers. The text data is one of t...
Emotion Classification of Chinese Microblog Text via Fusion of BoW and eVector Feature Representations
Sentiment analysis has been a hot research topic in recent years. Emotion classification is a finer-grained form of sentiment analysis that considers more than the polarity of sentiment. In this paper, we present our system of emotion analysis for Sina Weibo texts at both the document and sentence level, which detects whether a text is sentimental and further decides which emotion classes it conve...
Category Discrimination Based Feature Selection Algorithm in Chinese Text Classification
Improving classification precision is a major issue in the field of Chinese text classification. The tf-idf algorithm is a classic and widely used feature selection algorithm based on the vector space model (VSM). However, the traditional tf-idf algorithm neglects a feature term's distribution within and across categories, which leads to many unreasonable selection results. This paper makes an improvement t...
Feature Selection on Chinese Text Classification Using Character N-Grams
In this paper, we perform Chinese text classification using an n-gram text representation on TanCorp, a new large corpus built for Chinese text classification, with more than 14,000 texts divided into 12 classes. We use different n-gram features (1-, 2-grams or 1-, 2-, 3-grams) to represent documents. Different feature weights (absolute text frequency, relative text frequency, absolute n-gram f...
Journal
Journal title: Applied Sciences
Year: 2023
ISSN: 2076-3417
DOI: https://doi.org/10.3390/app13095399